Diversity of bacterial community in the rhizosphere and bulk soil of Artemisia annua grown in highlands of Uganda

Highland areas in Uganda are suitable for the farming of Artemisia annua. However, A. annua harvested from these areas contains varying concentrations of antimalarial components. This may be attributed to variation in soil properties, which affect the vegetative growth characters, yield and active compounds of A. annua. Thus, the bacterial composition and physicochemical properties of soil from the Kabale and Kabarole highland areas where A. annua is grown were studied. The study objective was to determine the diversity of the bacterial community in the rhizosphere and bulk soil of A. annua grown in the highlands of Uganda. The composition of the bacterial community was analyzed by amplicon sequencing of 16S rRNA genes on an Illumina MiSeq platform. A total of 1,420,688 reads were obtained and clustered into 163,493 Operational Taxonomic Units (OTUs). Kabarole highland had more OTUs (87,229) than Kabale (76,264). The phylum Proteobacteria (34.2%) was the most prevalent, followed by Acidobacteria (17.3%) and Actinobacteria (15.5%). The bacterial communities in the two highlands differed significantly (p < 0.05) for all phyla except Proteobacteria. The main genera in bulk soil were Povalibacter, Brevitalea, Nocardioides, Stenotrophobacter, Gaiella and Solirubrobacter. Sphingomonas, Ramlibacter, Paludibaculum and Pseudarthrobacter were the main genera in A. annua rhizospheric soil.


Introduction
Artemisia annua is widely grown in different parts of the world as a cheap source of antimalarial compounds such as artemisinin, flavonoids, aromatic oils and polysaccharides [1,2]. In Uganda, A. annua was introduced around 2003 [3] and is mainly grown in Wakiso, Kaberamaido, Kapchorwa, Rukungiri, Kabarole and Kabale districts. It is cultivated as a monocrop or intercropped with beans. The content of active compounds varies greatly depending on the geographical location. The highland areas produce A. annua with more active compounds than lowland areas [3,4]. For instance, the authors of [3] observed that artemisinin and total flavonoid levels were higher in samples from highland areas (Kabarole and Kabale) than in those from a lowland region (Wakiso): 0.8% and 0.5% vs 0.4% for artemisinin, and 2.6% and 2.55% vs 1.5% for total flavonoids, respectively. However, this artemisinin concentration is very low compared with other parts of the world, which produce A. annua with up to 2% artemisinin [5]. Improving the concentrations of the antimalarial compounds in A. annua grown in Uganda is therefore an important area of investigation. Various plant growth promoting bacteria (PGPB) such as Azotobacter, Azospirillum, Bacillus and Pseudomonas have been reported elsewhere to increase the concentration of artemisinin [6,7]. However, there is no study that highlights the rhizobacterial community of A. annua, yet understanding the rhizobacterial community of a given plant species is vital when considering the use of rhizobacteria as plant growth promoters [8]. This study therefore aimed at profiling the diversity of the bacterial community in the rhizosphere and bulk soil of A. annua grown in the highlands of Uganda as a basis for the use of microbial inoculants to enhance its antimalarial compounds.

Study sites and sample collection
The study was conducted in two highland areas of Uganda: Kabale (South Western Uganda) and Kabarole (Western Uganda). These areas produce large volumes of A. annua with artemisinin content ranging from 0.5 to 0.8% [3]. The altitude of Kabarole and Kabale districts is 1,300-3,800 meters and 2,000 meters above sea level, respectively. Kabarole and Kabale districts receive annual rainfall of 1,200-1,500 mm and 800-1,000 mm, respectively [9,10]. Soil samples were collected at the time of harvesting A. annua. Rhizosphere and bulk soils were sampled from the four existing cropping types: (i) intercrop of A. annua and beans (AA+B), (ii) beans alone (B), (iii) A. annua alone (A) and (iv) control with no crop grown (N AA/B). For the physicochemical analysis, composite samples (1.5 kg each) were taken from the topsoil (0-15 cm). For each of the 4 cropping types, 4 farms in each of the 2 districts having similar treatments were considered as replicates, bringing the total to 32 samples. Each composite sample consisted of 10 cores obtained using a zigzag technique. All 32 samples were placed in clean, labelled, sealable plastic bags and transported to the laboratory within a day of collection. The soil was dried and sieved through a 2 mm sieve. For DNA extraction and molecular analyses, samples (2.2 g each) were put in collecting tubes and stored at −80 °C. The same replication was used for these samples: for each of the 4 cropping types, 4 farms in each of the 2 districts were considered as replicates, and in each replicate 4 plants were selected to obtain a rhizospheric sample (where A. annua plants were grown) or 4 cores were taken to obtain a bulk sample (where there was no A. annua).
was conducted to detect the presence of gDNA. Later, the concentration of gDNA was quantified using a Qubit 3 Fluorometer (Singapore), and an average of 6.8 ng/μl was obtained. Thereafter, 14 μl of gDNA was sent to Macrogen for amplicon-based metagenomic sequencing.
Preprocessing and clustering. This was carried out using two programs, CD-HIT-OTU MiSeq/FLX [18] and rDNA tools-PACBIO [19]. The steps involved identifying and removing chimeric reads, filtering out short reads and trimming extra-long tails. Filtered reads were clustered at 100% identity using CD-HIT-DUP. Secondary clusters were recruited into primary clusters. Noise sequences in clusters of size X or below were removed, where X was statistically calculated. The remaining representative reads from non-chimeric clusters were clustered using a greedy algorithm into OTUs at a user-specified OTU cut-off (e.g. 97% identity at species level). The results of preprocessing are summarized in Table 1.
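The greedy clustering step can be illustrated with a minimal Python sketch. It is not the CD-HIT-OTU implementation: the `identity` function is a simplifying assumption that compares equal-length, pre-aligned reads position by position, and the function names and toy reads are hypothetical. It only shows the general idea of assigning each read, from most to least abundant, to the first existing centroid it matches at or above the cut-off.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length reads.

    Simplifying assumption: reads are already aligned and of equal length.
    Real tools compute identity from a sequence alignment.
    """
    if len(a) != len(b) or not a:
        return 0.0
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def greedy_cluster(reads_with_counts, cutoff=0.97):
    """Greedy centroid clustering of (sequence, abundance) pairs.

    Reads are visited from most to least abundant. Each read joins the first
    centroid it matches at >= cutoff identity; otherwise it seeds a new OTU.
    """
    otus = []
    ordered = sorted(reads_with_counts, key=lambda item: item[1], reverse=True)
    for seq, count in ordered:
        for otu in otus:
            if identity(seq, otu["centroid"]) >= cutoff:
                otu["members"].append(seq)
                otu["size"] += count
                break
        else:
            # No centroid was close enough, so this read starts a new OTU.
            otus.append({"centroid": seq, "members": [seq], "size": count})
    return otus


# Hypothetical toy reads, 100 bases each.
base = "ACGT" * 25
variant = "TG" + base[2:]   # 2 mismatches out of 100, i.e. 98% identity to base
other = "TTGCA" * 20        # unrelated read
toy_otus = greedy_cluster([(base, 50), (variant, 5), (other, 20)])
print(len(toy_otus), [otu["size"] for otu in toy_otus])
```

Sorting by abundance first means that the most common read in each group becomes the OTU representative, which is the usual motivation for the greedy ordering.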
Taxonomic assignment and diversity statistics. This was carried out using the program QIIME [20]. The number of reads used in the analysis was considered adequate since the rarefaction curves for the various samples flattened to the right. Representative sequences from each OTU were used to assign taxonomy. Furthermore, to identify differences between the various treatments and the two study sites, data were analyzed statistically by one-way analysis of variance and independent Student's t-test (p < 0.05) using SPSS 21.0 software (SPSS, Chicago, IL, USA). In addition, Principal Component Analysis (PCA) was conducted to find the lowest number of factors that could account for the variability in the original variables associated with those factors.
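The reasoning behind the rarefaction check can be sketched as follows. This is an illustrative Python sketch, not the QIIME procedure: reads are subsampled without replacement at increasing depths and the mean number of distinct OTUs is recorded, so a curve that stops rising suggests that further sequencing would add few new OTUs. The function name, the number of iterations and the toy counts are assumptions made for the example.

```python
import random


def rarefaction_curve(otu_counts, depths, n_iter=20, seed=0):
    """Mean number of distinct OTUs seen when subsampling reads.

    otu_counts: dict mapping an OTU identifier to its read count in one sample.
    depths: increasing subsample sizes (number of reads).
    Returns a list of (depth, mean observed OTUs) pairs.
    """
    rng = random.Random(seed)
    # Expand the count table into one OTU label per read.
    reads = [otu for otu, n in otu_counts.items() for _ in range(n)]
    curve = []
    for depth in depths:
        if depth > len(reads):
            break  # cannot subsample more reads than the sample contains
        observed = [len(set(rng.sample(reads, depth))) for _ in range(n_iter)]
        curve.append((depth, sum(observed) / n_iter))
    return curve


# Hypothetical sample with four OTUs and 100 reads.
toy_sample = {"otu1": 50, "otu2": 30, "otu3": 15, "otu4": 5}
for depth, mean_otus in rarefaction_curve(toy_sample, [10, 50, 100]):
    print(depth, mean_otus)
```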
Genera abundance. A total of 626 bacterial genera (abundance ≥ 0.01%) were observed among the 3 most prevalent phyla (Proteobacteria, Acidobacteria and Actinobacteria) common to both districts. However, PCA was conducted on the 31 genera that showed higher abundance (≥ 0.2%) and on the physicochemical properties of soil. The suitability of the data for factor analysis was assessed by inspecting the correlation matrix, which revealed many coefficients of 0.3 and above. Furthermore, since the variables had different units (%, mg/l), they were first normalized [21] to bring the values of the different variables within a comparable range. This was done by subtracting the mean from the observed value and dividing by the standard deviation for each of the 42 variables using the following formula.

\[ \text{Normalized value} = \frac{\text{Observed value} - \text{Mean}}{\text{Standard deviation}} \]

Having standardized the data, weights were assigned to all the soil properties using Principal Component Analysis (PCA) in STATA 15 statistical software. The loadings from the first component of the PCA were used as the weights for the indicators. The assigned weights varied between -1 and +1, with the sign and magnitude of each weight describing the contribution of that variable to the overall value of the index for soil properties. Two statistical tests were first conducted to determine the suitability of PCA. The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy was 0.811, above the minimum recommended level of 0.60 [22], and Bartlett's test of sphericity [23] was statistically significant (p = 0.000 < 0.01), supporting the factorability of the correlation matrix. The number of components to retain was determined from eigenvalues greater than 1.0 and the scree plot. PCA revealed nine components with eigenvalues exceeding 1, which cumulatively explained 88.32% of the variance. However, five components were first used in the analysis, and the components that contributed smaller amounts of the total variance were left out. Finally, using Cattell's scree plot [24], it was decided to retain two components for further investigation. The two-component solution explained a total of 56.34% of the variance, with Component 1 contributing 47.61% and Component 2 contributing 8.73%. To aid in the interpretation of these two components, varimax rotation was performed. Using the communalities table, which shows how much of the variance in each item is explained, soil components with values less than 0.3 were discarded, since such values indicated that they failed to load on the components obtained and did not fit well with the other items in their component. Furthermore, a rotated factor matrix table was constructed to show the factor loadings after rotation.
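The standardization and the eigenvalue-based retention rule described above can be sketched in Python. This is a minimal illustration rather than the STATA procedure used in the study: the function names, the choice of the sample standard deviation and all input values are assumptions, and the eigenvalues shown are hypothetical, not those of the study.

```python
from statistics import mean, stdev


def zscore(values):
    """Standardize one variable: (observed value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1); an assumption here
    if s == 0:
        raise ValueError("a constant variable cannot be standardized")
    return [(v - m) / s for v in values]


def explained_variance(eigenvalues):
    """Percentage of the total variance carried by each component."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


def kaiser_retained(eigenvalues, threshold=1.0):
    """Indices of components whose eigenvalue exceeds the threshold."""
    return [i for i, ev in enumerate(eigenvalues) if ev > threshold]


# Hypothetical pH readings from four farms.
ph_standardized = zscore([6.2, 6.6, 7.1, 7.5])
print(ph_standardized)

# Hypothetical eigenvalues from a PCA on five standardized variables.
toy_eigenvalues = [2.5, 1.2, 0.8, 0.3, 0.2]
print(explained_variance(toy_eigenvalues))
print(kaiser_retained(toy_eigenvalues))  # components 0 and 1 exceed 1.0
```

After standardization every variable has mean 0 and standard deviation 1, so variables measured in % and in mg/l contribute on the same scale. The eigenvalue rule is only a first screen; as in the text, the scree plot may justify retaining fewer components.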

Physicochemical properties
Results obtained from the analysis of the physical and chemical properties of soil are presented in Table 2. The soils in Kabale are clay loam with neutral pH (6.64), while Kabarole soils are sandy loam with slightly alkaline pH (7.48). The content of Nitrogen, Potassium, Sodium

Phyla abundance
The phylum Proteobacteria was the most prevalent, followed by Acidobacteria and Actinobacteria (Fig 1A and 1B). Comparing the two districts, Kabale soils had a higher diversity of bacteria than Kabarole soils, and there was a significant difference (P ≤ 0.05) among all phyla except Proteobacteria. Kabale soil samples contained more Acidobacteria, Bacteroidetes, Verrucomicrobia, Gemmatimonadetes and Firmicutes, while Kabarole soil samples contained more Actinobacteria and other phyla such as Planctomycetes. With respect to the various soil treatments, there was no significant difference (P ≥ 0.05) in the abundance of the various phyla in Kabale. However, in Kabarole, there were significant differences (P ≤ 0.05) in the abundance of the phyla Actinobacteria (14.85%) and Verrucomicrobia (7.01%) between soils with A. annua only and the other treatments. Comparing bulk and rhizospheric soils, Proteobacteria were dominant in both, but Acidobacteria and Actinobacteria were more dominant in the bulk soils of Kabale and Kabarole, respectively.
Genus abundance. The abundances of the bacterial genera (≥ 0.2%) of the most prevalent phyla (Proteobacteria, Acidobacteria and Actinobacteria) in bulk and rhizospheric soil are shown in Table 3. Most genera in Kabale soils did not show any significant differences, whereas most genera in Kabarole soils did. In bulk soil, the genera that showed significant differences were po, br, no, str, ga and sol (Povalibacter, Brevitalea, Nocardioides, Stenotrophobacter, Gaiella and Solirubrobacter). In rhizospheric soils, many genera (sps, ac, ram, ma, rhi, aci, ed, si, oc, pa and pse) showed significant differences from the bulk soil, and sps, ram, pa and pse (Sphingomonas, Ramlibacter, Paludibaculum and Pseudarthrobacter) showed significant differences in both Kabale and Kabarole soils. Results of the PCA of the various soil components (genera and physicochemical properties) are shown in Fig 2. Seven genera (sps, Lu, he, st, ma, spm and pse) had higher positive loadings on the second component. Twenty-nine soil components had higher loadings on the first component: 18 (sand, Ca, Mg, P, N, OCa, pH, Sol, ga, Str, Vi, acis, pe, az, ram, ral, P and Ly) had positive factor loadings, while 11 (ac, ni, ps, aci, rhi, ed, si, oc, pa, pa_a and clay) had negative factor loadings.
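The abundance cut-off applied to the genera (≥ 0.2%) amounts to converting read counts into percentages of the sample total and keeping the genera at or above the threshold. A minimal Python sketch is given below; the function names and the toy counts are hypothetical and do not reproduce the study data or the software used.

```python
def relative_abundance(counts):
    """Convert raw read counts per genus into percentages of the sample total."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {genus: 100.0 * n / total for genus, n in counts.items()}


def filter_abundant(counts, min_percent=0.2):
    """Keep the genera whose relative abundance is at least min_percent."""
    rel = relative_abundance(counts)
    return {genus: pct for genus, pct in rel.items() if pct >= min_percent}


# Hypothetical read counts for one sample (1,000 reads in total).
toy_counts = {"Sphingomonas": 600, "Ramlibacter": 390, "rare_genus": 1, "other": 9}
print(filter_abundant(toy_counts))
```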

Discussion
The most abundant phyla of the A. annua rhizospheric bacterial community were Gemmatimonadetes, Acidobacteria and Proteobacteria. With the exception of Acidobacteria, the results differed from those reported in [25], where Chloroflexi, Cyanobacteria and Planctomycetes were the most abundant in A. annua rhizosphere soil, whereas in this study they were the least abundant. The variation may stem from the fact that the plants were of different varieties and were grown in different soil types at different altitudes.
Proteobacteria have been reported to be dominant in nutrient-rich (copiotrophic) soil, while Acidobacteria are dominant in oligotrophic conditions [26] and in soils with lower pH [27]. Thus, the results obtained agreed with what has been reported. Both bulk and rhizospheric soils had copiotrophic conditions, as the ratio of Proteobacteria to Acidobacteria was high [26] and the nutrient levels were also high (Table 2). Furthermore, the bulk soils of Kabale had lower values of the physicochemical properties and were therefore expected to have more Acidobacteria than rhizospheric soils, which is what was observed. On the other hand, more Actinobacteria were observed in the slightly alkaline Kabarole sandy loam soils, which were rich in organic matter, than in the neutral Kabale soils; this observation agreed with the reports of [28,29].
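As a rough illustration of the ratio referred to above, the overall phylum abundances reported in the abstract (Proteobacteria 34.2%, Acidobacteria 17.3%) can be used. These are study-wide figures rather than values for bulk or rhizospheric soil separately, so the result is indicative only:

```latex
\[
\frac{\text{Proteobacteria}}{\text{Acidobacteria}} = \frac{34.2}{17.3} \approx 1.98
\]
```

On these figures, Proteobacteria were roughly twice as abundant as Acidobacteria.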

Conclusion
In conclusion, the results show that the A. annua rhizosphere is a large reservoir of bacteria that may be capable of many roles. Most of the species observed in the rhizosphere are not among the non-symbiotic PGPB most frequently mentioned in various reports (Azospirillum sp., Azotobacter sp., Bacillus sp., Pseudomonas sp., etc.). However, many of the species belong to the phylum Proteobacteria, which constitutes most PGPB. Thus, the use of selected bacteria, especially Proteobacteria, may promote A. annua growth and increase its phytochemical content.